#Here's the library load and the loading of the various raw files
library(DT)
library(readxl)
library(tidyverse)
library(mosaic)
library(ggplot2)
library(dplyr)

#Load Datasets
PraxisRaw <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/PraxisData.xlsx")
CertificationRaw <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/CertificationColumnRaw.xlsx")
DispositionRaw <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/Disposition.xlsx")
DanielsonRaw <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/Danielson.xlsx")
DemographicsByStudentRaw <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DemographicsByStudent.xlsx")
# To force column types, pass the following as an extra argument to the read_excel() call above:
# col_types = c("numeric", "text", "text", "text", "numeric", "numeric", "numeric", "numeric", "numeric", "text", "text",
#               "numeric", "numeric", "text", "text", "text", "numeric", "text", "text", "numeric", "numeric", "text",
#               "text", "text", "numeric", "numeric", "text", "text", "text", "text", "text", "numeric", "text",
#               "numeric", "text", "numeric", "numeric", "text", "text", "text", "text", "numeric", "numeric", "numeric")
PraxisDummy <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DummyData/Praxisdummy.xlsx")
CertificationDummy <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DummyData/CertificationDummy.xlsx")
DispositionDummy <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DummyData/DispositionDummy.xlsx")
DanielsonDummy <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DummyData/DanielsonDummy.xlsx")
DemographicsByStudentDummy <- read_excel("C:/Users/bpacini/Desktop/DissertationResearch/Data/DummyData/DemographicsDummy.xlsx")

I: Overview

This document is an exploration of the data I will be using for my dissertation. It includes example data, summaries, limitations, data cleaning protocols, code, and analysis that is limited to investigating raw data (that is, no advanced statistical techniques).

Project Overview

My dissertation aims to explore two fundamental questions.

  1. What are the teacher-candidate characteristics that are associated with success in becoming an effective teacher?
  2. How do the various measures of what it means to be a successful teacher validate one another (or fail to)?

But how do we define successful teaching?

The gold standard for educational research is test scores: while much more than test scores goes into a great education, fundamentally, if a child cannot read, then they aren’t getting what they deserve from their education. What’s more, as Chetty has found, test scores bear strong relationships to other important variables such as income in later life, incarceration rates, and teen pregnancy. Unfortunately, Idaho does not have a data infrastructure to track the quality of the teachers that different teacher preparation programs (TPPs) graduate.

Fortunately, there are a number of variables that I do have access to that are effective proxies for quality teaching, namely:

  • Praxis scores.
  • Certification status.
  • Graduation with a teacher education degree.
  • Final evaluation during student teaching of teaching practice.
  • Self-surveys of teaching practice, completed one year after graduation.
  • Administrator surveys of teaching practice, completed two years after graduation.

My purpose is to investigate each of these outcome variables to see what predicts each. My methodology will be multiple regression analysis for all of these.

This constitutes research question 1.

Research question 2 is focused on validating my six outcome variables to see the extent to which they are related to one another.

For more information, see my comprehensive exam paper here.

About the Author

I’m a professor of education in the Department of Elementary, Early, and Special Education. This is the best job I could imagine. I have four wonderful kids, and an incredible wife–and they are one of many reasons that I’d like to get this dissertation done as quickly as I can!

“Kelsier is awesome.”

Acknowledgements

I never quite realized just how big of a community project a good dissertation was. Big thanks to my committee members (and past committee members) Dr. Heidi Holmes Erickson, Dr. Bryan Bowles, Dr. Isaac Calvert, Dr. Donald Baum, and Dr. Pamela Hallam. Huge shoutout to the many friends who cheered me on, supported me, or gave me ideas, and to the group of RAs who did phenomenal work putting something amazing together: Luke Russell, Kelin Tang, Alica Pao, Spencer Driggs, and Alina Rojas. I can’t say enough about the help of J. Hathaway and Garrett Saunders–I quite literally could not have done this without them.

And most importantly, to my kids and my wife, I promise to make every minute of this time up to you, with interest. May future generations find a way to be wiser than to steal a father from his family for something of comparatively so little worth.

Literature Review

My full literature review is available at the link to my comprehensive exam paper, here. I’ve also provided a review in brief below.

At best, research to date reveals only a modest ability to predict who will become an effective educator. The predictive ability of common individual factors is mixed or minimal; these include licensure exams, college GPAs, surveys, and observational assessments using rubrics. In isolation, these show only small, and often inconsistent, relationships with teaching effectiveness. Cognitive skills and academic performance are marginally better predictors, but effects are still minimal. No single factor is a strong, consistent predictor on its own, though analyzing several factors together may yield a better predictive model.

There is currently a minor maelstrom of debate brewing over the usefulness of teacher preparation programs generally: do licensure and traditional credentialing have any quantifiable value whatsoever? The research literature is mixed, but the estimated effects are far smaller than one would hope given the expense and complicated nature of the teacher credentialing infrastructure we have built.

The purpose of my research, then, is threefold: to put an upper bound on the predictive ability of individual factors (such as high school and college GPA, credentialing status, ACT/SAT scores, and any other factor that might yield insight into future teaching ability); to assess whether these factors can actually predict reaching teacher status; and to cross-validate the factors against one another. This will have meaningful application for education departments throughout the country as they seek to focus their limited resources on what is most important for K-12 students. What’s more, it holds significant meaning for policymakers as they think through crafting effective systems, rules, and legislation.

II: Description of Data Sources

Data sources are from a large private university in the intermountain west, and contain data for university students in education programs at said university (hereafter, ‘teacher candidates’).

Note that the data shown below are dummy data, used for privacy and data integrity reasons, but they are enough to give a sense of the data I actually have access to.

Demographics

General demographic information. This data set is pulled from a combination of the university student information system, the student teaching office, and the admissions office and contains fields such as ethnicity, age at first semester, gender, and ACT/SAT scores.

datatable(DemographicsByStudentDummy, caption="Demographics Raw Dataset", options = list(pageLength = 10), extensions="Responsive")

Praxis

Praxis data. Praxis is the state-mandated licensure exam series, and all education students must take and pass it in order to gain credentialed status. This data set includes test dates, scores, passing status, and the student ID that links it to all other data sets.

datatable(PraxisDummy, caption="Praxis Raw Dummy", options = list(pageLength = 10), extensions="Responsive")

Observations

Danielson’s “Framework for Teaching” (FFT) scores. BYU-Idaho is required by the state of Idaho to use the FFT for scoring future teachers. This is a rubric of four domains and four score levels (unsatisfactory, basic, proficient, and distinguished, though most teachers and administrators colloquially refer to them by their numeric equivalents of 1, 2, 3, and 4).

datatable(DanielsonDummy, caption="Danielson Raw Dummy", options = list(pageLength=10), extensions = "Responsive")

Dispositions

Disposition evaluations. These are performed each semester by faculty at BYU-Idaho using a proprietary rubric with eight facets, including social, emotional, and cognitive presence, compliance, and dedication to teaching.

datatable(DispositionDummy, caption="Dispositions Raw Dummy", options = list(pageLength = 10))

Certification

Certification data. This is from a set of data maintained by student teaching services. The codes and information are not systematized, and so some entries originally included “XX” or “Cert” or “Signed” and have been carefully translated into a 0/1 binary variable.

datatable(CertificationDummy, caption="Certification Raw Dummy", options = list(pageLength = 10))
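
The translation of free-text certification codes into a 0/1 variable could be sketched roughly as follows. This is only an illustration, not the procedure actually used: the column name `CertCode`, the toy rows, and the rule that any of “XX”, “Cert”, or “Signed” counts as certified (and anything else, including blanks, does not) are all assumptions made for the example.

```r
library(dplyr)

# Hypothetical toy data; CertCode is an assumed column name
cert_toy <- tibble(
  StudentID = 1:5,
  CertCode  = c("XX", "Cert", "Signed", NA, "")
)

# Assumed rule: any of these markers means the candidate was certified
certified_markers <- c("XX", "Cert", "Signed")

cert_binary <- cert_toy %>%
  # %in% returns FALSE for NA and blank entries, so they are coded 0
  mutate(Certified = if_else(CertCode %in% certified_markers, 1L, 0L))

cert_binary
```

Keeping the list of accepted markers in one named vector makes it easy to audit which raw codes were treated as “certified.”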

Surveys

Graduate and Employer Surveys. BYU-Idaho administers two standard surveys annually: a survey of teacher education graduates one year after graduation, and a survey of employers of our graduates two years after graduation. Graduates rate themselves (and are rated by administrators) on a number of core teaching proficiencies.

(Note: these data are still incoming.)

III.I: Basic Analysis: Demographic Data

Visualization

Numerical Summary

Data Cleaning

Our rich dataset of demographic data was too bulky and unwieldy to use intelligently. As such, I decided to keep only the columns relevant to my analysis:

DemographicsByStudent <- DemographicsByStudentRaw %>%
  select(StudentID, ServedMission, Ethnicity, Gender, CurrentCumGPA, 
         ACTComposite, SAT2016Composite, HighSchoolGPA, ReceivedPell,
         CurrentAdmissionStatus, FirstSemAttended, FirstSemAge, 
         FirstSemMaritalStatus, FirstSemMajorCode, FirstSemMajor, AssociateDegreesEarned)
#Note no remove duplicates function happening here

III.II: Basic Analysis: Praxis

Visualization

Numerical Summary

Praxis data in my data set includes 18,409 records, each of which is a test attempt. Basic statistical information is available below:

min  Q1 median  Q3 max     mean       sd     n missing
  1 156    168 179 780 169.4862 40.49006 18409       0

While this basic information may seem straightforward, comparing test pass rates yields interesting information that can help a university to improve passing rates on challenging tests. For example, when breaking out test pass rates by test, we get the following:
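
A breakdown of pass rates by test could be computed along the following lines. This is a minimal sketch on made-up data: `TestName` matches the column used elsewhere in this document, but the logical `Passed` column and the toy rows are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy attempts; Passed is an assumed column name
attempts_toy <- tibble(
  TestName = c("Test A", "Test A", "Test A", "Test B", "Test B"),
  Passed   = c(TRUE, FALSE, TRUE, FALSE, FALSE)
)

pass_by_test <- attempts_toy %>%
  group_by(TestName) %>%
  summarize(
    Attempts = n(),          # number of attempts recorded for the test
    PassRate = mean(Passed)  # share of those attempts that passed
  ) %>%
  arrange(PassRate)          # hardest tests (lowest pass rate) first

pass_by_test
```

Reporting the number of attempts next to each pass rate helps flag tests where the rate rests on only a handful of records.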

Data Cleaning

To ask useful questions, some data cleaning and restructuring were in order. First, I used left_join to merge my demographic and Praxis datasets so that I could analyze the relationship between demographic data and Praxis data. I also filtered out any Praxis records with no matching demographic data (identified by a missing FirstSemMajor) for this round of analysis. Finally, I used mutate to create a new column holding the ratio of successful attempts to total attempts. This yields a pass rate that is a useful proxy for general success on the Praxis.

#mergefunction
PraxisMerged <- left_join(PraxisRaw, DemographicsByStudentRaw, by = "StudentID")

#FilteroutNAs
PraxisMerged <- PraxisMerged %>%
  filter(!is.na(FirstSemMajor))

#Create_Pass_Rate
DemographicsByStudentRaw <- DemographicsByStudentRaw %>%
    #select(StudentID, AttemptedPraxisTests, PassedPraxisTests) %>%
    # Division returns NA whenever either count is missing, so no na.rm argument is needed.
    # (Inside mutate(), na.rm = TRUE would only create an extra column named na.rm.)
    mutate(PassRate = PassedPraxisTests / AttemptedPraxisTests)

Next I began to take a basic look through the data. I began with summary statistics of my data, using a great little function called “favstats.”
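
For reference, a favstats call looks roughly like the sketch below. The toy data frame is made up for illustration; only the `favstats()` function (from the mosaic package loaded at the top of this document) and the `Score` and `TestName` column names come from my actual workflow.

```r
library(mosaic)

# Hypothetical toy scores
praxis_toy <- data.frame(
  TestName = c("Test A", "Test A", "Test A", "Test B", "Test B", "Test B"),
  Score    = c(150, 160, 170, 155, 165, 181)
)

# Overall summary of a single quantitative variable
overall <- favstats(~ Score, data = praxis_toy)

# The same summary broken out by a categorical variable
by_test <- favstats(Score ~ TestName, data = praxis_toy)

overall
by_test
```

The one-sided formula summarizes a single variable, while the `y ~ x` form returns one row of statistics per group.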

I then created an overall histogram of scores. It shows some skewness and kurtosis but is overall very close to a normal curve. The histogram also gives a good overview of frequency and total tests attempted.

hist(
     PraxisMerged$Score, 
     col = 'skyblue3',
     breaks = seq(min(PraxisMerged$Score), max(PraxisMerged$Score) + 5, by = 5),
     xlim = c(120, 220))  # Adjust the range based on your preference

Next, I wanted to find out whether different Praxis tests are taken by candidates with different student profiles. I computed the mean and summary information for several variables, including high school GPA and admissions test scores, to see whether the populations differed and, if so, whether the differences were statistically significant. They were: I ran a grouped series of pairwise t-tests, and over 90% of them came back significant. I therefore rejected the null hypothesis that the groups share the same means, with the caveat that running this many tests inflates the chance of false positives unless the p-values are adjusted for multiple comparisons.
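
A grouped set of pairwise t-tests of this kind can be sketched with base R's `pairwise.t.test()`. The example below runs on simulated data, and the Holm adjustment and the unpooled standard deviations are my own choices for the illustration rather than a record of the settings used in the original analysis.

```r
set.seed(1)

# Hypothetical toy data: high school GPA for takers of three tests
toy <- data.frame(
  TestName      = rep(c("Test A", "Test B", "Test C"), each = 30),
  HighSchoolGPA = c(rnorm(30, 3.2, 0.3), rnorm(30, 3.5, 0.3), rnorm(30, 3.5, 0.3))
)

# t-tests between every pair of tests, with Holm-adjusted p-values
pw <- pairwise.t.test(toy$HighSchoolGPA, toy$TestName,
                      p.adjust.method = "holm", pool.sd = FALSE)

# The result is a lower-triangular matrix; drop the empty (NA) cells
p_values <- pw$p.value[!is.na(pw$p.value)]

# Share of pairwise comparisons significant at the 0.05 level
share_significant <- mean(p_values < 0.05)
share_significant
```

Comparing `share_significant` with and without an adjustment (`p.adjust.method = "none"`) shows how much of the apparent signal survives correction.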

# Fit a multiple linear regression model
model <- lm(PassRate ~ FirstSemMajor + FirstSemMaritalStatus + FirstSemAge + ReceivedPell + HighSchoolGPA + SAT2016Composite + ACTComposite + CurrentCumGPA + Gender + Ethnicity + ServedMission, data = DemographicsByStudentRaw)

# Summarize the model
summary(model)
## 
## Call:
## lm(formula = PassRate ~ FirstSemMajor + FirstSemMaritalStatus + 
##     FirstSemAge + ReceivedPell + HighSchoolGPA + SAT2016Composite + 
##     ACTComposite + CurrentCumGPA + Gender + Ethnicity + ServedMission, 
##     data = DemographicsByStudentRaw)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.70435 -0.04813  0.00168  0.10524  0.30015 
## 
## Coefficients:
##                                               Estimate Std. Error t value
## (Intercept)                                  0.1452143  1.0359316   0.140
## FirstSemMajor399 - General Studies          -0.0056552  0.2803258  -0.020
## FirstSemMajor415 - Business Management      -0.0482227  0.2978896  -0.162
## FirstSemMajor465 - Civil Engineering         0.0925837  0.3712635   0.249
## FirstSemMajor481 - Exercise Physiology      -0.4278649  0.3571614  -1.198
## FirstSemMajor489 - Biomedical Science        0.0071076  0.3125596   0.023
## FirstSemMajor600 - Accounting                0.1215300  0.3549118   0.342
## FirstSemMajor611 - Recreation Management     0.0807860  0.3551912   0.227
## FirstSemMajor623 - Political Science        -0.4563931  0.3205406  -1.424
## FirstSemMajor625 - History                   0.1693077  0.3609941   0.469
## FirstSemMajor648 - Food Sciences             0.0366766  0.3481164   0.105
## FirstSemMajor655 - Dance                     0.2500469  0.3575804   0.699
## FirstSemMajor660 - Art                       0.2479083  0.3060657   0.810
## FirstSemMajor675 - Communication            -0.1532892  0.2953840  -0.519
## FirstSemMajor700 - Biology                   0.2373683  0.3136508   0.757
## FirstSemMajor710 - Chemistry                 0.2251896  0.3499982   0.643
## FirstSemMajor751 - Public Health             0.0289673  0.3085551   0.094
## FirstSemMajor760 - Psychology                0.1689749  0.3133618   0.539
## FirstSemMajor770 - Physics                  -0.2229799  0.3517822  -0.634
## FirstSemMajor780 - Sociology                -0.3152905  0.3697748  -0.853
## FirstSemMajor800 - Biology Education         0.0644335  0.3609076   0.179
## FirstSemMajor815 - History Education        -0.0453661  0.3152649  -0.144
## FirstSemMajor830 - English Ed Composite      0.1496905  0.3237357   0.462
## FirstSemMajor836 - English Education         0.2764442  0.3683024   0.751
## FirstSemMajor850 - Mathematics Education     0.1070895  0.3191959   0.335
## FirstSemMajor852 - Math Education Composite  0.0481427  0.2995724   0.161
## FirstSemMajor860 - Art Education             0.0641466  0.3526242   0.182
## FirstSemMajor862 - Art Education Composite  -0.0651277  0.3146506  -0.207
## FirstSemMajor870 - Physics Education         0.0890550  0.3591797   0.248
## FirstSemMajor880 - Social Studies Ed Compos -0.2145369  0.3670878  -0.584
## FirstSemMajor890 - Music Education Composit  0.1086652  0.3046078   0.357
## FirstSemMajor910 - Spanish Education        -0.2146877  0.2786478  -0.770
## FirstSemMajor935 - Theatre Education        -0.6019833  0.3502148  -1.719
## FirstSemMajor940 - Fam & Consum Sci Ed Comp  0.2938728  0.3584516   0.820
## FirstSemMajor980 - Early Child/ Special Ed   0.1849795  0.2931237   0.631
## FirstSemMajor985 - Special Education K-12    0.0465626  0.3426309   0.136
## FirstSemMajor990 - Elementary Education      0.0708952  0.2815432   0.252
## FirstSemMaritalStatusSingle                  0.4545775  0.3384399   1.343
## FirstSemAge                                  0.0045682  0.0357322   0.128
## ReceivedPell                                -0.0520736  0.0474323  -1.098
## HighSchoolGPA                               -0.2799122  0.1227067  -2.281
## SAT2016Composite                             0.0004800  0.0003893   1.233
## ACTComposite                                 0.0116963  0.0130635   0.895
## CurrentCumGPA                                0.1005371  0.0716684   1.403
## GenderMale                                   0.0499800  0.1552856   0.322
## EthnicityOther                              -0.6882095  0.2581815  -2.666
## EthnicityTwo or More Ethnicities             0.2504140  0.1918817   1.305
## EthnicityUnknown                            -0.0627596  0.2693663  -0.233
## EthnicityWhite                               0.0373013  0.1322244   0.282
## ServedMissionTrue                            0.0156683  0.0529897   0.296
##                                             Pr(>|t|)   
## (Intercept)                                  0.88888   
## FirstSemMajor399 - General Studies           0.98396   
## FirstSemMajor415 - Business Management       0.87182   
## FirstSemMajor465 - Civil Engineering         0.80373   
## FirstSemMajor481 - Exercise Physiology       0.23456   
## FirstSemMajor489 - Biomedical Science        0.98192   
## FirstSemMajor600 - Accounting                0.73295   
## FirstSemMajor611 - Recreation Management     0.82067   
## FirstSemMajor623 - Political Science         0.15849   
## FirstSemMajor625 - History                   0.64037   
## FirstSemMajor648 - Food Sciences             0.91636   
## FirstSemMajor655 - Dance                     0.48646   
## FirstSemMajor660 - Art                       0.42041   
## FirstSemMajor675 - Communication             0.60527   
## FirstSemMajor700 - Biology                   0.45145   
## FirstSemMajor710 - Chemistry                 0.52185   
## FirstSemMajor751 - Public Health             0.92544   
## FirstSemMajor760 - Psychology                0.59126   
## FirstSemMajor770 - Physics                   0.52803   
## FirstSemMajor780 - Sociology                 0.39646   
## FirstSemMajor800 - Biology Education         0.85877   
## FirstSemMajor815 - History Education         0.88595   
## FirstSemMajor830 - English Ed Composite      0.64509   
## FirstSemMajor836 - English Education         0.45516   
## FirstSemMajor850 - Mathematics Education     0.73815   
## FirstSemMajor852 - Math Education Composite  0.87274   
## FirstSemMajor860 - Art Education             0.85612   
## FirstSemMajor862 - Art Education Composite   0.83656   
## FirstSemMajor870 - Physics Education         0.80483   
## FirstSemMajor880 - Social Studies Ed Compos  0.56062   
## FirstSemMajor890 - Music Education Composit  0.72225   
## FirstSemMajor910 - Spanish Education         0.44335   
## FirstSemMajor935 - Theatre Education         0.08960 . 
## FirstSemMajor940 - Fam & Consum Sci Ed Comp  0.41481   
## FirstSemMajor980 - Early Child/ Special Ed   0.52984   
## FirstSemMajor985 - Special Education K-12    0.89225   
## FirstSemMajor990 - Elementary Education      0.80185   
## FirstSemMaritalStatusSingle                  0.18312   
## FirstSemAge                                  0.89860   
## ReceivedPell                                 0.27565   
## HighSchoolGPA                                0.02527 * 
## SAT2016Composite                             0.22133   
## ACTComposite                                 0.37336   
## CurrentCumGPA                                0.16464   
## GenderMale                                   0.74842   
## EthnicityOther                               0.00934 **
## EthnicityTwo or More Ethnicities             0.19571   
## EthnicityUnknown                             0.81638   
## EthnicityWhite                               0.77861   
## ServedMissionTrue                            0.76826   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2087 on 78 degrees of freedom
##   (30979 observations deleted due to missingness)
## Multiple R-squared:  0.545,  Adjusted R-squared:  0.2592 
## F-statistic: 1.907 on 49 and 78 DF,  p-value: 0.005325
#This chunk is for model text that I might need later DanielsonNarrow<-filter(Danielson, ED1.3 == "Final (Summative)" & Danielson, Course=="ED492")

#Demo_Cert <- merge(Demographic, Certification, by="StudentID", all.x=TRUE) #Reminder only; I don't expect to merge this way.
#This is where my plots live as examples
ggplot(DemographicsByStudent, aes(x=ServedMission, y=CurrentCumGPA)) +
  geom_boxplot(fill="skyblue", color="black") + labs( title="GPA and RMs")
favstats(CurrentCumGPA~ServedMission, data=DemographicsByStudent)
t.test(CurrentCumGPA~ServedMission, data=DemographicsByStudent)

ggplot(DemographicsByStudent, aes(x=ACTComposite, y=as.numeric(ServedMission=="True"))) +
  geom_jitter(pch=16, cex=0.1, width=0.2, height=0.1)+
  geom_smooth(method="glm",formula=y~x,method.args=list(family="binomial")) + 
  geom_smooth()
# Assuming 'TestName' is the column indicating the type of test
PraxisMergedSummary <- PraxisMerged %>%
  group_by(TestName) %>%
  summarize(
    MeanHighSchoolGPA = mean(HighSchoolGPA, na.rm = TRUE),
    MeanSATScore = mean(SAT2016Composite, na.rm = TRUE),
    MeanACTScore = mean(ACTComposite, na.rm = TRUE),
    # Add more summary statistics as needed
  )

 # Assuming 'TestName' is the column indicating the type of test
 test_names <- unique(PraxisMerged$TestName)
 
anova_sat <- aov(SAT2016Composite ~ TestName, data = PraxisMerged)
summary(anova_sat)
##               Df   Sum Sq Mean Sq F value   Pr(>F)    
## TestName      43  1787215   41563    2.67 5.78e-08 ***
## Residuals   1127 17543664   15567                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 16939 observations deleted due to missingness
anova_ACT <- aov(ACTComposite ~ TestName, data = PraxisMerged)
summary(anova_ACT)
anova_HSGPA <- aov(HighSchoolGPA~TestName, data=PraxisMerged)
summary(anova_HSGPA)
##                Df Sum Sq Mean Sq F value Pr(>F)    
## TestName      115   55.8  0.4856   2.957 <2e-16 ***
## Residuals   14957 2455.9  0.1642                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 3037 observations deleted due to missingness
#histogram of p-values
#hist(
 # t_test_results$p_value,
#  breaks = seq(0, 1, by = 0.005),

#  col = 'skyblue3',
#  xlim = c(0, 1))

(Note: Check the assumptions of the ANOVA model.) (Note: the ACT model has to be summarized with summary(anova_ACT); calling summary(anova_sat) a second time only reprints the SAT table.) My next goal is to calculate the R^2 measure for each ANOVA test. The formula for R^2 is

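$$R^2 = \frac{SS_{\text{TestName}}}{SS_{\text{TestName}} + SS_{\text{Residuals}}}$$

Calculating the R^2 measure for high school GPA, we get:

$$R^2 = \frac{55.8}{55.8 + 2455.9} = \frac{55.8}{2511.7} \approx 0.022$$

And for SAT composite scores:

$$R^2 = \frac{1787215}{1787215 + 17543664} \approx 0.092$$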

In other words, while these F-statistics are highly significant, my initial estimate of the share of variation explained by the test taken is only 0.022 for high school GPA and 0.092 for SAT scores, which is small at best.
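
Rather than copying sums of squares by hand, the R^2 measure can be pulled straight from a fitted ANOVA object. The sketch below uses simulated data (the `toy` data frame and `Outcome` column are made up), cross-checks the result against the R-squared that `lm()` reports for the same model, and then applies the same ratio to the sums of squares printed in the ANOVA tables above.

```r
set.seed(2)

# Hypothetical toy data with three groups
toy <- data.frame(
  TestName = rep(c("Test A", "Test B", "Test C"), each = 20),
  Outcome  = c(rnorm(20, 0), rnorm(20, 0.5), rnorm(20, 1))
)

fit <- aov(Outcome ~ TestName, data = toy)

# "Sum Sq" column of the ANOVA table: first entry TestName, second Residuals
ss <- summary(fit)[[1]][["Sum Sq"]]

# R^2 = SS(TestName) / (SS(TestName) + SS(Residuals))
r_squared <- ss[1] / sum(ss)

# Cross-check against the R-squared reported by lm() for the same model
lm_r_squared <- summary(lm(Outcome ~ TestName, data = toy))$r.squared
all.equal(r_squared, lm_r_squared)

# The same ratio applied to the sums of squares reported in the ANOVA tables
r2_hsgpa <- 55.8 / (55.8 + 2455.9)
r2_sat   <- 1787215 / (1787215 + 17543664)
```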

Based on this analysis, it seems clear that the profile of students who take the secondary English Praxis differs significantly from e.g. those who take the Elementary social studies subtest. At the same time, that variation is not explained by selection on its own, and further analysis might yield more information.
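
The note above about checking ANOVA assumptions could be followed up with something like the sketch below. It runs on simulated data, and the particular checks (Shapiro-Wilk, Bartlett, and a Welch one-way test as a fallback) are my own suggestions rather than steps already carried out in this analysis.

```r
set.seed(3)

# Hypothetical toy data with three groups
toy <- data.frame(
  TestName = factor(rep(c("Test A", "Test B", "Test C"), each = 25)),
  Outcome  = c(rnorm(25, 0), rnorm(25, 0.3), rnorm(25, 0.6))
)

fit <- aov(Outcome ~ TestName, data = toy)

# Normality of the residuals (Shapiro-Wilk test)
normality <- shapiro.test(residuals(fit))

# Homogeneity of variance across groups (Bartlett's test)
equal_var <- bartlett.test(Outcome ~ TestName, data = toy)

# Welch's one-way test, which does not assume equal variances
welch <- oneway.test(Outcome ~ TestName, data = toy, var.equal = FALSE)

normality$p.value
equal_var$p.value
welch$p.value
```

One practical caveat: `shapiro.test()` accepts at most 5000 values, so on the full Praxis data a Q-Q plot of the residuals (for example `qqnorm(residuals(fit))`) would be needed instead.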

III.III: Basic Analysis: Dispositions

Visualization

Numerical Summary

Data Cleaning

In order to tackle the dispositions data, I needed to mutate the data into numerical format rather than characters (i.e. “3” rather than “proficient”).

# Keep only student teaching (ED 492) disposition evaluations
DispositionFiltered <- filter(DispositionRaw, Course=="ED 492")
# Replace character ratings with numeric equivalents
# Note: "Not Observed" is coded as 99 and must be excluded before computing means
DispositionMutated <- DispositionFiltered %>%
  mutate(across(starts_with("Q"), 
                ~ case_when(
                    .x == "Exemplary" ~ 4,
                    .x == "Proficient" ~ 3,
                    .x == "Developing" ~ 2,
                    .x == "Unacceptable" ~ 1,
                    .x == "Not Observed" ~ 99,
                    TRUE ~ NA_real_)))
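
Because “Not Observed” is stored as 99, any mean computed directly on the recoded columns would be badly inflated. One way to handle this, sketched below on made-up data, is to turn the sentinel into a missing value before averaging. The `Q1` to `Q3` column names and the toy ratings are assumptions for illustration only.

```r
library(dplyr)

# Hypothetical toy ratings already recoded to numbers; Q1-Q3 are assumed names
ratings_toy <- tibble(
  StudentID = 1:3,
  Q1 = c(4, 3, 99),
  Q2 = c(3, 99, 2),
  Q3 = c(4, 3, 2)
)

ratings_clean <- ratings_toy %>%
  # Treat the 99 "Not Observed" sentinel as missing
  mutate(across(starts_with("Q"), ~ na_if(.x, 99))) %>%
  # Mean rating per candidate over the items that were actually observed
  mutate(MeanRating = rowMeans(across(starts_with("Q")), na.rm = TRUE))

ratings_clean
```

The same two steps would apply to the Danielson columns, which use the same 99 code.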

III.IV: Basic Analysis: Danielson Ratings

Visualization

Numerical Summary

Data Cleaning

The first step in analyzing Danielson data was to clean the data and prepare it for analysis. In this case, that meant filtering for my outcome variable. Rather than select every Danielson score, I limited it to only ED 492 (student teaching) evaluations that were in person and summative.

Additionally, in order to calculate means and summary statistics, I needed to mutate data that read “proficient” to “3” which I did below.

# Keep only final, in-person student teaching (ED 492) observations
DanielsonFilteredStage2 <- filter(DanielsonRaw, ED1.3 == "Final (Summative)" & Course=="ED 492" &  ED1.4=="In-person observation")
# Replace character ratings with numeric equivalents
# Note: "Not Observed" is coded as 99 and must be excluded before computing means
DanielsonMutatedStage3 <- DanielsonFilteredStage2 %>%
  mutate(across(starts_with("Q"), 
                ~ case_when(
                    .x == "Distinguished (4)" ~ 4,  # assumed label, following the pattern of the other levels
                    .x == "Proficient (3)" ~ 3,
                    .x == "Basic (2)" ~ 2,
                    .x == "Unsatisfactory (1)" ~ 1,
                    .x == "Not Observed" ~ 99,
                    TRUE ~ NA_real_)))

# Convert columns to numeric (This is already taken care of above)
#DanielsonMutatedStage3 <- DanielsonMutatedStage3 %>%
  #mutate(across(starts_with("Q"), as.numeric))

III.V: Basic Analysis: Self-Surveys

Visualization

Numerical Summary

Data Cleaning

III.VI: Basic Analysis: Employer Surveys

Visualization

Numerical Summary

Data Cleaning

#Use favstats on quantitative
#Use favstats on quantitative and categorical using ~
  #favstats(y~x, data=PraxisMerged)
#stripchart seems really interesting too. 
#Build using buttons using Saunders code, include "datatable" followed by #"favstats" and graphical summary.